AITopics | Thompson

Machine learning practitioners often face significant challenges in formally integrating their prior knowledge and beliefs into predictive models, limiting the potential for nuanced and context-aware analyses. Moreover, the expertise needed to integrate this prior knowledge into probabilistic modeling typically limits the application of these models to specialists. Our goal is to build a regression model that can process numerical data and make probabilistic predictions at arbitrary locations, guided by natural language text which describes a user's prior knowledge. Large Language Models (LLMs) provide a useful starting point for designing such a tool since they 1) provide an interface where users can incorporate expert insights in natural language and 2) provide an opportunity for leveraging latent problem-relevant knowledge encoded in LLMs that users may not have themselves. We start by exploring strategies for eliciting explicit, coherent numerical predictive distributions from LLMs. We examine these joint predictive distributions, which we call LLM Processes, over arbitrarily many quantities in settings such as forecasting, multi-dimensional regression, black-box optimization, and image modeling. We investigate the practical details of prompting to elicit coherent predictive distributions, and demonstrate their effectiveness at regression. Finally, we demonstrate the ability to usefully incorporate text into numerical predictions, improving predictive performance and giving quantitative structure that reflects qualitative descriptions. This lets us begin to explore the rich, grounded hypothesis space that LLMs implicitly encode.

mae, mixtral mae, nll, (14 more...)

arXiv.org Machine Learning

2405.12856

Country:

North America > Canada > Ontario > Toronto (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > Canada > Quebec > Montreal (0.04)
(13 more...)

Genre: Research Report > New Finding (1.00)

Industry: Banking & Finance (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

AI's Regimes of Representation: A Community-centered Study of Text-to-Image Models in South Asia

Qadri, Rida, Shelby, Renee, Bennett, Cynthia L., Denton, Emily

arXiv.org Artificial Intelligence, May-19-2023

This paper presents a community-centered study of cultural limitations of text-to-image (T2I) models in the South Asian context. We theorize these failures using scholarship on dominant media regimes of representations and locate them within participants' reporting of their existing social marginalizations. We thus show how generative AI can reproduce an outsider's gaze for viewing South Asian cultures, shaped by global and regional power inequities. By centering communities as experts and soliciting their perspectives on T2I limitations, our study adds rich nuance to existing evaluative frameworks and deepens our understanding of the culturally specific ways AI technologies can fail in non-Western and Global South settings. We distill lessons for responsible development of T2I models, recommending concrete pathways forward that can allow for recognition of structural inequalities.

artificial intelligence, machine learning, natural language, (19 more...)

doi: 10.1145/3593013.3594016

2305.11844

Country:

North America > United States > New York > New York County > New York City (0.15)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Ontario > Toronto (0.14)
(24 more...)

Genre: Research Report > Experimental Study (0.46)

Industry:

Leisure & Entertainment (1.00)
Health & Medicine (1.00)
Media > Film (0.67)
Education (0.66)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.49)

ParaDRAM: A Cross-Language Toolbox for Parallel High-Performance Delayed-Rejection Adaptive Metropolis Markov Chain Monte Carlo Simulations

Shahmoradi, Amir, Bagheri, Fatemeh

arXiv.org Machine Learning, Aug-21-2020

We present ParaDRAM, a high-performance Parallel Delayed-Rejection Adaptive Metropolis Markov Chain Monte Carlo software for optimization, sampling, and integration of mathematical objective functions encountered in scientific inference. ParaDRAM is currently accessible from several popular programming languages, including C/C++, Fortran, MATLAB, and Python, and is part of the ParaMonte open-source project, which has the following principal design goals: 1. full automation of Monte Carlo simulations, 2. interoperability of the core library with as many programming languages as possible, thus providing a unified Application Programming Interface and Monte Carlo simulation environment across all programming languages, 3. high performance, 4. parallelizability and scalability of simulations from personal laptops to supercomputers, 5. virtually zero dependence on external libraries, 6. fully deterministic reproducibility of simulations, 7. automatic comprehensive reporting and post-processing of the simulation results. We present and discuss several novel techniques implemented in ParaDRAM to automatically and dynamically ensure the good mixing and the diminishing adaptation of the resulting pseudo-Markov chains. We also discuss the implementation of an efficient data storage method used in ParaDRAM that reduces the average memory and storage requirements of the algorithm by a factor of 4 for simple simulation problems and by an order of magnitude or more for sampling complex high-dimensional mathematical objective functions. Finally, we discuss how the design goals of ParaDRAM can help users readily and efficiently solve a variety of machine learning and scientific inference problems on a wide range of computing platforms.

machine learning, programming language, simulation, (16 more...)

2008.09589

Country:

North America > United States > Texas > Tarrant County > Arlington (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
(5 more...)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)
